A new look at generating multi-join continuous query plans: A qualified plan generation problem
Abstract
State-of-the-art relational and continuous algorithms alike have focused on producing optimal or near-optimal query plans by minimizing a single cost function. However, ensuring accurate yet real-time responses for stream processing applications necessitates that the system identifies qualified rather than optimal query plans with the former guaranteeing that their utilization of both the CPU and the memory resources stays within their respective system capacities. In such scenarios, being optimal in one resource usage while out-of-bound in the other is not viable. Our experimental study illustrates that to be effective a qualified plan optimizer must explore an extended plan search space called the jtree space composed not only of the standard mjoin and binary join plans, but also of general join trees with mixed operator types. While our proposed dynamic programming-based JTree-Finder algorithm is guaranteed to generate a qualified query plan if such a plan exists in the search space, its exponential time complexity makes it not viable for continuous stream environments. To facilitate run-time optimization, we thus propose an efficient yet effective two-layer plan generation framework. The proposed framework first exploits the positive correlation between the CPU and memory usages to obtain plans that are minimal in at least one of the two resource usages. In our second layer we propose two alternative polynomial-time algorithms to explore the negative correlation between the resource usages to successfully generate query plans that adhere to both CPU and memory resource constraints. Effectiveness and efficiency of the proposed algorithms are experimentally evaluated by comparing them to each other as well as state-of-the-art techniques.
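The distinction the abstract draws between an optimal plan and a qualified plan can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Plan` type, the plan names, the cost figures, and the capacities are hypothetical and are not taken from the paper, and the brute-force filter stands in for neither JTree-Finder nor the two-layer framework.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Plan:
    """A candidate join plan with estimated resource usage (hypothetical units)."""
    name: str
    cpu: float
    mem: float


def is_qualified(plan: Plan, cpu_cap: float, mem_cap: float) -> bool:
    # A plan is qualified only if BOTH resource usages stay within capacity.
    return plan.cpu <= cpu_cap and plan.mem <= mem_cap


def qualified_plans(plans: List[Plan], cpu_cap: float, mem_cap: float) -> List[Plan]:
    # Brute-force filter over an explicitly enumerated plan list (illustration only).
    return [p for p in plans if is_qualified(p, cpu_cap, mem_cap)]


# Invented cost figures: one plan is cheapest in memory, one is cheapest in CPU,
# and a mixed join tree sits between them on both axes.
plans = [
    Plan("mjoin", cpu=90.0, mem=20.0),
    Plan("binary", cpu=30.0, mem=85.0),
    Plan("mixed", cpu=55.0, mem=50.0),
]

cpu_cap, mem_cap = 60.0, 60.0  # assumed system capacities

cpu_opt = min(plans, key=lambda p: p.cpu)  # single-cost optimum for CPU
mem_opt = min(plans, key=lambda p: p.mem)  # single-cost optimum for memory
qualified = qualified_plans(plans, cpu_cap, mem_cap)

print("CPU-optimal:", cpu_opt.name, "qualified:", is_qualified(cpu_opt, cpu_cap, mem_cap))
print("Memory-optimal:", mem_opt.name, "qualified:", is_qualified(mem_opt, cpu_cap, mem_cap))
print("Qualified plans:", [p.name for p in qualified])
```

With these invented numbers, each single-resource optimum exceeds the capacity of the other resource, and only the mixed plan satisfies both constraints. This is the situation that motivates searching a space that includes mixed join trees.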
Similar resources
Multi-Join Continuous Query Optimization: Covering the Spectrum of Linear, Acyclic, and Cyclic Queries
Abstract. Traditional optimization algorithms that guarantee optimal plans have exponential time complexity and are thus not viable in streaming contexts. Continuous query optimizers commonly adopt heuristic techniques such as Adaptive Greedy to attain polynomial-time execution. However, these techniques are known to produce optimal plans only for linear and star shaped join queries. Motivated ...
Monitoring Stream Properties for Continuous Query Processing
We are developing a general-purpose Data Stream Management System for processing continuous queries over multiple continuous data streams [MW 03]. When a new continuous query is registered, our query optimizer creates an initial query plan (possibly merged with existing plans for previously registered queries), and allocates initial resources, such as memory for join or aggregation synopses [GG...
BatWAn: A Binary and Multi-Way Query Plan Analyzer
The majority of existing SPARQL query engines generate query plans composed of binary join operators. Albeit effective, binary joins can drastically impact on the performance of query processing whenever source answers need to be passed through multiple operators in a query plan. Multi-way joins have been proposed to overcome this problem; they are able to propagate and generate results in a si...
Parallelizing query optimization
Many commercial RDBMSs employ cost-based query optimization exploiting dynamic programming (DP) to efficiently generate the optimal query execution plan. However, optimization time increases rapidly for queries joining more than 10 tables. Randomized or heuristic search algorithms reduce query optimization time for large join queries by considering fewer plans, sacrificing plan optimality. Thou...
Semantic Query Optimization for Query Plans of Heterogeneous Multidatabase Systems
New applications of information systems, such as electronic commerce and healthcare information systems, need to integrate a large number of heterogeneous databases over computer networks. Answering a query in these applications usually involves selecting relevant information sources and generating a query plan to combine the data automatically. As significant progress has been made in source se...
Journal title:
- Data Knowl. Eng.
Volume 69, Issue
Pages -
Publication date 2010